The Stochastic Segment Model for Continuous Speech Recognition

نویسندگان

  • M. Ostendorf
  • V. Digalakis
چکیده

A new direction in speech recognition via statistical methods is to move from frame-based models, such as Hidden Markov Models (HMMs), to segment-based models that provide a better framework for model-ing the dynamics of the speech production mechanism. The Stochastic Segment Model (SSM) is a joint model for a sequence of observations, which provides explicit modeling of time correlation as well as a formalism for incorporating segmental features. In this work, the focus is on modeling time correlation within a segment. We consider three Gaussian model variations based on diierent assumptions about the form of statistical dependency, including a Gauss-Markov model, a dynamical system model and a target state model, all of which can be formulated in terms of the dynamical system model. Evaluation of the diierent modeling assumptions is in terms of both phoneme classiication performance and the predictive power of linear models .

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improvements in the Stochastic Segment Model for Phoneme Recognition

The heart of a speech recognition system is the acoustic model of sub-word units (e.g., phonemes). In this work we discuss refinements of the stochastic segment model, an alternative to hidden Markov models for representation of the acoustic variability of phonemes. We concentrate on mechanisms for better modelling time correlation of features across an entire segment. Results are presented for...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation

A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...

متن کامل

Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition

Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...

متن کامل

Stochastic trajectory model with state-mixture for continuous speech recognition

The problem of acoustic modeling for continuous speech recognition is addressed. To deal with coarticulation effects and interspeaker variability, an extension of the Mixture Stochastic Trajectory Model (MSTM) is proposed. MSTM is a segment-based model using phonemes as speech units. In MSTM, the observations of a phoneme are modeled by a set of stochastic trajectories. The trajectories are mod...

متن کامل

Speech Recognition Using a Discriminative , Context - Independent , Segment - Based SpeechRecognizerJan

| In this paper, we describe important improvements that were recently introduced in our Discriminative Stochastic Segment Model (DSSM) speech recognizer. We propose a new presegmen-tation algorithm and we optimize the structure of the Multi-Layer Perceptron (MLP) that estimates the phone probabilities. Additionally, we describe a cascade MLP combination technique that relaxes the drawbacks of ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1991